NetNews Offline 2


Internet Message Format | 1996-08-05 | 2.1 KB

Path: hydra.zrz.TU-Berlin.DE!rawneiha
From: rawneiha@hydra.zrz.TU-Berlin.DE (Philipp Boerker)
Newsgroups: comp.sys.amiga.programmer
Subject: Re: TMapping again!
Date: 5 Feb 1996 09:47:59 GMT
Organization: Technical University of Berlin, Germany
Message-ID: <4f4jof$h3b@news.cs.tu-berlin.de>
References: <4d6v0t$3dt@maureen.teleport.com> <4dg4jk$km@news.cs.tu-berlin.de> <4dhvd5$5r2@maureen.teleport.com> <38232113@kone.fipnet.fi> <4e10ol$ck3@maureen.teleport.com> <4e2ku6$31m@news.cs.tu-berlin.de> <4eec27$pte@maureen.teleport.com>
NNTP-Posting-Host: hydra.zrz.tu-berlin.de
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8bit

sschaem@teleport.com (Stephan Schaem) writes:

>Philipp Boerker (rawneiha@hydra.zrz.TU-Berlin.DE) wrote:
>: sschaem@teleport.com (Stephan Schaem) writes:
>: >  repeat 8
>: >   mw D1,D2
>: >   mb D0,D2
>: >   addx.l d7,D0
>: >   movea.l d2,a0
>: >   addx.l d6,D1
>: >   mw (A0),d3
>: >   mw D1,D2
>: >   mb D0,D2
>: >   movea.l d2,a0
>: >   mb (A0),d3
>: >   addx.l d7,D0
>: >   addx.l d6,D1
>: >   mw d3,(a1)+
>: >  endr

>: I think mapping 2 pixels like you did is not optimal.
>: [...]

> 'proper' pipelining... or maximum overlape of bus and sequencer
> activity for my test is as above. I didn't count paper cycles,
> but saw my fps get improved when I do the above VS 2 move.b ,(a1)+
> (BTW notice the instruction register usage, and the ordering. should
> be optimal for a 060 and take the best advantage of overlap in the
> case of a 2 move.b to mem version)

The ordering can still be optimized for the 060: mw d1,d2 and mb d0,d2
have a data dependency (both write d2). You could put one of the
addx's in between.

> I agree about doing word read can cross long boundary and require 2
> access... But if its a problem on other usage of the loop above
> Its so simple to make it write to (a1)+ vs d3.

Have you tried

   mb (a0),d3
   lsl.w #8,d3

instead of

   mw (a0),d3 ?

Maybe it is faster.

> Stephan

Greets,
 Phil.
grond/matrix
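
The suggested byte-read-plus-shift replacement only helps if it leaves the same word in d3, so here is a rough C model of the register operations to check that. It assumes mw and mb are macros for move.w and move.b (the quoted text mentions "2 move.b"), and all function names are made up for this sketch. It says nothing about which variant is faster on a real 030/040/060; it only compares the resulting pixel pair.

```c
#include <stdint.h>

/* Model of "mw (a0),d3": big-endian word read into the low word of d3.
   Note that it touches the byte at a0+1 as well. */
uint32_t move_w_load(uint32_t d3, const uint8_t *a0) {
    uint16_t w = (uint16_t)((a0[0] << 8) | a0[1]);
    return (d3 & 0xFFFF0000u) | w;
}

/* Model of "mb (a0),d3": byte read into the low byte of d3 only. */
uint32_t move_b_load(uint32_t d3, const uint8_t *a0) {
    return (d3 & 0xFFFFFF00u) | a0[0];
}

/* Model of "lsl.w #8,d3": shift the low word left by 8, upper word kept. */
uint32_t lsl_w_8(uint32_t d3) {
    uint16_t w = (uint16_t)((d3 & 0xFFFFu) << 8);
    return (d3 & 0xFFFF0000u) | w;
}

/* Variant from the quoted loop: word read for texel 0, byte read for texel 1. */
uint16_t pair_word_read(const uint8_t *t0, const uint8_t *t1) {
    uint32_t d3 = 0;
    d3 = move_w_load(d3, t0);
    d3 = move_b_load(d3, t1);
    return (uint16_t)d3;
}

/* Suggested variant: byte read plus shift for texel 0, byte read for texel 1. */
uint16_t pair_byte_shift(const uint8_t *t0, const uint8_t *t1) {
    uint32_t d3 = 0;
    d3 = move_b_load(d3, t0);
    d3 = lsl_w_8(d3);
    d3 = move_b_load(d3, t1);
    return (uint16_t)d3;
}

/* Compare both variants over every texel pair of a small dummy texture.
   The texture has one spare byte because the word read looks at a0+1. */
int count_mismatches(void) {
    uint8_t tex[257];
    int bad = 0;
    for (int i = 0; i < 257; i++)
        tex[i] = (uint8_t)(i * 37 + 11);
    for (int i = 0; i < 256; i++)
        for (int j = 0; j < 256; j++)
            if (pair_word_read(&tex[i], &tex[j]) != pair_byte_shift(&tex[i], &tex[j]))
                bad++;
    return bad;
}
```

In this model both variants give the same word for every pair, and the byte version never reads the extra byte at a0+1, which is the access that can cross a longword boundary. Whether the extra lsl.w costs more than that saves would have to be measured on the actual CPU.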